Cory Whitney
Open RStudio
Type ‘?’ followed by a function, package or data set name in the R console to open its help page
Paste a copy of the error message into a web search and add “R” to the query
Help > Cheatsheets > Data Visualization with ggplot2
Load data
participants_data <- read.csv("participants_data.csv")
R has several systems for making graphs
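plot() function
# If academic_parents was read in as character (the read.csv() default since R 4.0), wrap it in factor() first
plot(participants_data$academic_parents)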
Bar plot of number of observations of binary data related to academic parents
plot() function
plot(participants_data$academic_parents, participants_data$days_to_email_response)
Boxplot of days to email response grouped by binary data related to academic parents
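What plot() draws depends on the class of its arguments. The sketch below illustrates this with the built-in mtcars data rather than the participants data; the variable name transmission is introduced here only for illustration.

```r
# Assumption: mtcars stands in for participants_data.
# am is coded 0 = automatic, 1 = manual; turn it into a labelled factor
transmission <- factor(mtcars$am, labels = c("automatic", "manual"))

# A single factor gives a bar plot of counts per level
plot(transmission)

# A factor on x and a numeric vector on y gives grouped boxplots
plot(transmission, mtcars$mpg)

# Two numeric vectors give a scatterplot
plot(mtcars$wt, mtcars$mpg)
```

If a binary column is stored as text or as 0/1 numbers, converting it with factor() first gives the bar plot and boxplot behavior described above.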
Many libraries and functions for graphs in R…
ggplot2 is one of the most elegant and most versatile.
ggplot2 implements the grammar of graphics, a system for describing and building graphs from layered components.
Do more and do it faster by learning one system and applying it in many places.
Learn more about ggplot2 in “The Layered Grammar of Graphics”
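A minimal sketch of the layering idea, using the mpg data that ships with ggplot2. The object name p and the axis labels are chosen here for illustration only.

```r
library(ggplot2)

# Start with data and aesthetic mappings only; this draws empty axes
p <- ggplot(data = mpg, aes(x = displ, y = hwy))

# Add a layer of points
p <- p + geom_point()

# Add labels and a theme; each "+" adds one more component to the plot
p <- p +
  labs(x = "Engine displacement (l)", y = "Highway miles per gallon") +
  theme_minimal()

# Printing the object draws the plot
p
```

Because the plot is an ordinary R object, it can be stored, extended with more layers later, and printed whenever needed.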
Example from your data
library(ggplot2)
ggplot(data = participants_data, aes(x=letters_in_first_name, y=days_to_email_response)) +
geom_point()
Scatterplot of days to email response as a function of the letters in your first name
Want to understand how all the pieces fit together? See the R for Data Science book: http://r4ds.had.co.nz/
ggplot(data = participants_data, aes(x=letters_in_first_name, y=days_to_email_response, color=academic_parents, size=working_hours_per_day)) +
geom_point()
Scatterplot of days to email response as a function of the letters in your first name, with colors representing binary data related to academic parents and bubble sizes representing working hours per day.
Make more graphs
Example from Anderson's iris data set
ggplot(data=iris, aes(x=Sepal.Length, y=Petal.Length, color=Species, size=Petal.Width))+
geom_point()
Scatterplot of iris petal length as a function of sepal length with colors representing iris species and petal width as bubble sizes.
aes() accepts expressions, so variables can be transformed inside the mapping, e.g. with log()
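# alpha is a constant here, so it is set in geom_point() rather than mapped in aes(), which would add a legend
ggplot(data = diamonds, aes(x=carat, y=price)) + geom_point(alpha = 0.2)
ggplot(data = diamonds, aes(x=log(carat), y=log(price))) + geom_point(alpha = 0.2)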
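Transforming inside aes() changes the plotted values, so the axes are labelled in log units. As an alternative sketch (not part of the original slides), scale_x_log10() and scale_y_log10() transform the axes instead and keep the tick labels in the original units. Note that this uses base 10 rather than the natural log used by log(); the shape of the point cloud is the same.

```r
library(ggplot2)

# Same diamonds data, but the log transformation is applied by the scales
p_log <- ggplot(data = diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2) +
  scale_x_log10() +
  scale_y_log10()

p_log
```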
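library(dplyr)
# Without a wt argument, top_n() ranks by the last column (z), so this keeps the 100 rows with the largest z (more if there are ties)
dsmall <- top_n(diamonds, n=100)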
#Plot with different colors for color
ggplot(data = dsmall, aes(x=carat, y=price, color = color))+ geom_point()
#Plot with different shapes for cut
ggplot( data = dsmall, aes(carat, price, shape = cut)) + geom_point()
Set parameters manually with I() (“Inhibit Interpretation/Conversion of Objects”), which passes a value through aes() as-is instead of mapping it to a scale
ggplot(data = diamonds, aes(carat, price, alpha=I(0.1), color=I("blue"))) + geom_point()
ggplot(data = diamonds, aes(carat, price, alpha=I(0.4), color=I("green"))) + geom_point()
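A common alternative to I() is to set constants as arguments of the geom, outside aes(). A minimal sketch, which should look the same as the blue plot above; the object name p_const is illustrative.

```r
library(ggplot2)

# Constants given to the geom are set directly and produce no legend
p_const <- ggplot(data = diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1, color = "blue")

p_const
```

Keeping aes() for variables from the data and geom arguments for fixed values makes it easier to see at a glance what is mapped and what is set.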
The geom_*() functions define the type of plot, e.g. points, line, boxplot, path, smooth. Geoms can also be combined by adding them as layers.
ggplot(data=dsmall, aes(x=carat, y=price))+
geom_point()+
geom_smooth()
geom_smooth() selects a smoothing method based on the data. Use method = to specify your preferred smoothing method.
ggplot(data=dsmall, aes(x=carat, y=price))+ geom_point()+ geom_smooth()
ggplot(data=diamonds, aes(x=carat, y=price))+ geom_point()+
geom_smooth(method = 'glm')
ggplot2 lines and smoothing options
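A sketch of choosing the smoothing method explicitly, using the mpg data from ggplot2. When no method is given, geom_smooth() uses loess for fewer than 1,000 observations and a GAM otherwise, and prints a message saying which one it picked.

```r
library(ggplot2)

p_smooth <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # Straight-line fit; se = FALSE hides the confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_smooth
```

Giving formula explicitly also silences the message that geom_smooth() otherwise prints about the formula it assumed.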
geom_boxplot()
ggplot(data=diamonds, aes(x=color, y=price/carat)) +
geom_boxplot()
geom_jitter() shows all points
ggplot(data=diamonds, aes(x=color, y=price/carat)) +
geom_boxplot()+
geom_jitter()
If points overplot, lowering alpha (making them more transparent) can help.
ggplot(data=diamonds, aes(x=color, y=price/carat, alpha=I(0.1))) +
geom_boxplot()+
geom_jitter()
ggplot(data = diamonds, aes(x=carat)) +
geom_density()
ggplot(data = diamonds, aes(x=carat, color = color)) +
geom_density()
ggplot(data = diamonds, aes(x=carat, color = color, alpha=I(0.3))) +
geom_density()
ggplot2 density plots
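Use factor() to treat a numeric variable as categorical. With the numeric cyl, ggplot uses a continuous color scale and fits one line; with factor(cyl), it colors and fits each group separately.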
ggplot(data = mpg, aes(x=displ, y=hwy, color = cyl))+
geom_point()+
geom_smooth(method="lm")
ggplot(data = mpg, aes(x=displ, y=hwy, color = factor(cyl)))+
geom_point()+
geom_smooth(method="lm")
ggplot2 subset with smooth line
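To see why factor() changes the plot, it can help to inspect the variable itself. A short sketch; the name cyl_groups is illustrative.

```r
library(ggplot2)  # provides the mpg data

# cyl is stored as an integer, so ggplot treats it as continuous
class(mpg$cyl)

# factor() turns it into a categorical variable with one level per value
cyl_groups <- factor(mpg$cyl)
levels(cyl_groups)

# Number of cars in each group
table(cyl_groups)
```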
for aes() in ggplot()
https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html#1
ggplot code in non-slow fashion
ggplot(mtcars, aes(mpg, y = hp, col = gear)) +
geom_point() +
ggtitle("My Title") +
labs(x = "the x label", y = "the y label", col = "legend title")
'Slow ggplotting' version for the same plot
ggplot(data = mtcars) +
aes(x = mpg) +
labs(x = "the x label") +
aes(y = hp) +
labs(y = "the y label") +
geom_point() +
aes(col = gear) +
labs(col = "legend title") +
labs(title = "My Title")
dplyr, ggplot2 and reshape2
library(reshape2) # provides melt()
part_data <- select_if(participants_data, is.numeric)
cormat <- round(cor(part_data), 1)
melted_cormat <- melt(cormat)
ggplot(data = melted_cormat, aes(x=Var1,
y=Var2, fill=value)) +
geom_tile()
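melt() reshapes the square correlation matrix into long format, one row per pair of variables, which is the layout geom_tile() needs. A sketch with the built-in mtcars data standing in for the participants data; the object names ending in _demo are illustrative.

```r
library(reshape2)

# Correlation matrix of three numeric columns (3 x 3)
cormat_demo <- round(cor(mtcars[, c("mpg", "hp", "wt")]), 1)

# Long format: columns Var1, Var2 and value, one row per matrix cell
melted_demo <- melt(cormat_demo)
head(melted_demo)
```

Var1 and Var2 hold the row and column names of the matrix, which is why the heatmap code maps them to x and y and maps value to fill.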
png(file = "cortile.png", width = 7, height = 6, units = "in", res = 300)
ggplot(data = melted_cormat, aes(x = Var1, y = Var2, fill = value)) + geom_tile() + theme(axis.text.x = element_text(angle = 45, hjust = 1))
dev.off()
?pdf
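As an alternative to opening and closing a graphics device, ggsave() writes a ggplot object to a file directly. A sketch using mtcars and a temporary file path; both are assumptions made for illustration, as are the object names.

```r
library(ggplot2)

p_save <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# The file type is taken from the extension; a temporary path is used here
out_file <- tempfile(fileext = ".png")
ggsave(filename = out_file, plot = p_save,
       width = 7, height = 6, units = "in", dpi = 300)
```

Changing the extension to .pdf should produce a PDF with the same dimensions.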
datasauRus, ggplot2 and gganimate
library(gganimate)
library(datasauRus)
ggplot(datasaurus_dozen, aes(x=x, y=y))+
geom_point()+
theme_minimal() +
transition_states(dataset, 3, 1) +
ease_aes('cubic-in-out')
tidyverse, ggplot2 and gganimate
ggplot(data = dsmall, aes(x = carat, y = price, color = color)) +
geom_line() +
transition_reveal(carat) +
ease_aes("linear") +
labs(title='Diamond carat: {frame_along}')
Test your new skills
Your turn to perform